Generator2: improvements and fixes #28

Merged: sebastian-nagel merged 4 commits into cc from generator2-counter-url-filters-rejected on May 6, 2024
sebastian-nagel commented on Apr 28, 2024


Improvements and fixes to Generator2 (CC's modification of Nutch's original fetch list Generator):

  • count URLs rejected by URL filters: add counters URL_FILTERS_REJECTED and URL_FILTER_EXCEPTION. Cf. NUTCH-3043 for the analogous implementation in Generator.
  • bug fix: make optional URL normalization work by passing the configuration property set via the command-line flag (-noNorm) on to the class SelectorMapper
  • refactoring: add override annotations, remove unnecessary casts, simplify logging statements
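
The first two items can be illustrated with a small, dependency-free sketch. It is not the Generator2 code: the class `SelectorSketch`, the property name `sketch.generate.normalize`, the `UrlFilter` and `FilterException` types and the `demoFilter()` helper are all hypothetical stand-ins, a plain `Map` replaces the Hadoop job configuration, and another `Map` replaces the job counters. Only the counter names URL_FILTERS_REJECTED and URL_FILTER_EXCEPTION are taken from the description above. Dropping a URL whose filter throws, and normalizing only after filtering, are choices of this sketch and may differ from the actual mapper.

```java
import java.util.HashMap;
import java.util.Map;
import java.util.function.UnaryOperator;

public class SelectorSketch {

  /** Hypothetical stand-in for a checked URL filter failure. */
  static class FilterException extends Exception {
    FilterException(String message) { super(message); }
  }

  /** Hypothetical stand-in for a URL filter chain: returns null to reject a URL. */
  @FunctionalInterface
  interface UrlFilter {
    String filter(String url) throws FilterException;
  }

  // Hypothetical property name, used only in this sketch.
  static final String NORMALIZE_KEY = "sketch.generate.normalize";

  private final boolean normalize;
  private final UrlFilter filter;
  private final UnaryOperator<String> normalizer;
  // Stand-in for job counters, keyed by counter name.
  final Map<String, Long> counters = new HashMap<>();

  SelectorSketch(Map<String, String> conf, UrlFilter filter,
      UnaryOperator<String> normalizer) {
    // The mapper must read the flag from the configuration it receives;
    // if the driver never stores it there, the default silently wins.
    this.normalize = Boolean.parseBoolean(conf.getOrDefault(NORMALIZE_KEY, "true"));
    this.filter = filter;
    this.normalizer = normalizer;
  }

  /** Returns the selected (optionally normalized) URL, or null if it is dropped. */
  String select(String url) {
    try {
      if (filter.filter(url) == null) {
        count("URL_FILTERS_REJECTED");
        return null;
      }
    } catch (FilterException e) {
      // Assumption of this sketch: a URL whose filter fails is dropped.
      count("URL_FILTER_EXCEPTION");
      return null;
    }
    return normalize ? normalizer.apply(url) : url;
  }

  private void count(String name) {
    counters.merge(name, 1L, Long::sum);
  }

  /** Demo filter: throws on "boom", rejects "reject", accepts everything else. */
  static UrlFilter demoFilter() {
    return url -> {
      if (url.contains("boom")) {
        throw new FilterException("filter failed: " + url);
      }
      return url.contains("reject") ? null : url;
    };
  }

  public static void main(String[] args) {
    Map<String, String> conf = new HashMap<>();
    conf.put(NORMALIZE_KEY, "false"); // what a driver would set for -noNorm
    SelectorSketch selector =
        new SelectorSketch(conf, demoFilter(), String::toLowerCase);
    String[] urls = {
        "HTTP://Example.com/", "http://reject.example/", "http://boom.example/"};
    for (String url : urls) {
      System.out.println(url + " -> " + selector.select(url));
    }
    System.out.println(selector.counters);
  }
}
```

With separate counters for rejections and exceptions, a job report can distinguish URLs that the filter rules excluded from URLs that a filter failed to process.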

- add counters URL_FILTERS_REJECTED and URL_FILTER_EXCEPTION
- pass the configuration property set via command-line flag
  (-noNorm) forward to the class SelectorMapper
sebastian-nagel merged commit d6e6c75 into cc on May 6, 2024